[SPARK-22602][SQL] remove ColumnVector#loadBytes #19815
cloud-fan wants to merge 1 commit into apache:master from
ae7db88 to 3a59b32
Test build #84170 has finished for PR 19815 at commit
Test build #84172 has finished for PR 19815 at commit
I will take a look this Sunday.
Shall we add a comment that getUTF8String reuses the data in the column vector? That seems different from the other getXXX APIs.
Hmm, but it looks like decodeToBinary will copy the byte data?
That seems orthogonal to this issue. It would be nice if we could avoid the copy though. That would require some work on the dictionary code path.
LGTM except for one comment.
It looks risky if we do not make a copy.
If we plan to use this API to avoid the unnecessary data copy, we should rename getUTF8String and check that none of its callers break the no-copy assumption.
This is a bit of a non-issue: the current on-heap code path already avoids making copies.
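For readers following the copy versus no-copy discussion above, here is a minimal sketch of the hazard being debated. It does not use Spark's classes: `ByteView`, `view`, and `copy` are hypothetical stand-ins for a string that points into a shared buffer and for one that owns its bytes. The assumption illustrated is only that a buffer may be overwritten after a view over it has been handed out.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ZeroCopyDemo {
    /** Hypothetical stand-in for a string value backed by a byte range. */
    static final class ByteView {
        private final byte[] base;
        private final int offset;
        private final int length;

        ByteView(byte[] base, int offset, int length) {
            this.base = base;
            this.offset = offset;
            this.length = length;
        }

        @Override
        public String toString() {
            // Decodes the bytes as they are at the time of the call.
            return new String(base, offset, length, StandardCharsets.UTF_8);
        }
    }

    /** No copy: the result aliases the caller's buffer. */
    static ByteView view(byte[] buffer, int offset, int length) {
        return new ByteView(buffer, offset, length);
    }

    /** Copy: the result owns a private snapshot of the bytes. */
    static ByteView copy(byte[] buffer, int offset, int length) {
        byte[] owned = Arrays.copyOfRange(buffer, offset, offset + length);
        return new ByteView(owned, 0, length);
    }

    public static void main(String[] args) {
        byte[] buffer = "sparkarrow".getBytes(StandardCharsets.UTF_8);
        ByteView shared = view(buffer, 0, 5);
        ByteView owned = copy(buffer, 0, 5);

        // Simulate the backing buffer being reused for new data.
        byte[] next = "flink".getBytes(StandardCharsets.UTF_8);
        System.arraycopy(next, 0, buffer, 0, next.length);

        System.out.println(shared); // prints "flink": the view sees the overwrite
        System.out.println(owned);  // prints "spark": the copy is unaffected
    }
}
```

The view is cheaper, but it is only safe while callers do not hold it past the point where the buffer is reused, which is the caller audit the reviewer asks for.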
It might be a bit better to use arrayData() instead of childColumns[0]. They are practically the same, but arrayData() makes the intent a bit clearer.
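A toy illustration of the readability point, assuming only what the comment states (that the accessor and the first child are practically the same). `ToyVector` is a hypothetical class for illustration, not Spark's ColumnVector.

```java
public class ArrayDataDemo {
    /** Hypothetical vector whose element data lives in its first child. */
    static class ToyVector {
        final String name;
        final ToyVector[] childColumns;

        ToyVector(String name, ToyVector... children) {
            this.name = name;
            this.childColumns = children;
        }

        /** Named accessor: says what child 0 means instead of where it is stored. */
        ToyVector arrayData() {
            return childColumns[0];
        }
    }

    public static void main(String[] args) {
        ToyVector bytes = new ToyVector("bytes");
        ToyVector strings = new ToyVector("strings", bytes);

        // Both expressions return the same object; only the readability differs.
        System.out.println(strings.arrayData() == strings.childColumns[0]); // prints "true"
    }
}
```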
Move it to
//
// APIs dealing with Bytes
//
Any better names?
Same here, use arrayData().
3a59b32 to 5711bb2
LGTM pending Jenkins.
Test build #84203 has finished for PR 19815 at commit
Thanks! Merged to master.
What changes were proposed in this pull request?
ColumnVector#loadBytes is only used as an optimization for reading UTF8String in WritableColumnVector. This PR moves this optimization to WritableColumnVector and simplifies it.
How was this patch tested?
existing test